-
Notifications
You must be signed in to change notification settings - Fork 181
Use Calcite's validation system for type checking & coercion #4892
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
📝 WalkthroughSummary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughAdds a Calcite-based validation and implicit type coercion pipeline for PPL by round-tripping RelNode↔SqlNode. Introduces PPL-specific validator, type coercion/rules, convertlet table, dialect updates, converters/shuttles, and utilities. Refactors type utilities/factory, operand checkers, and UDF metadata. Simplifies function registries; removes legacy type checker/coercion utils. Updates extensive tests/resources and docs. Changes
Sequence Diagram(s)sequenceDiagram
participant P as QueryService
participant R as RelNode (PPL plan)
participant RS as PplRelToSqlNodeConverter
participant V as PplValidator (Calcite)
participant SR as PplSqlToRelConverter
participant O as Optimizer
P->>P: executeWithCalcite(...)
P->>R: build initial plan
P->>RS: Rel→Sql conversion (dialect-aware)
RS-->>P: SqlNode
P->>V: validate(SqlNode) with PplTypeCoercion/Convertlets
V-->>P: validated SqlNode (with implicit casts)
P->>SR: Sql→Rel conversion (semi/anti joins fixup)
SR-->>P: validated RelNode
P->>O: optimize(validated RelNode)
O-->>P: optimized plan
P-->>P: execute
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Comment |
e085f81 to
fc6dd27
Compare
…checking Signed-off-by: Yuanchun Shen <[email protected]> # Conflicts: # core/src/main/java/org/opensearch/sql/executor/QueryService.java # Conflicts: # core/src/main/java/org/opensearch/sql/executor/QueryService.java
Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
… logics Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
- 2 more ITs passed in PPLBuiltinFunctionIT Signed-off-by: Yuanchun Shen <[email protected]> # Conflicts: # core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
- this fix testRand, where desrialization of sarg does not restore its type - todo: update the toRex in ExtendedRelJson to the align with the latest version Signed-off-by: Yuanchun Shen <[email protected]>
…estamp; (time, timestamp) -> timestamp (1240/1599) Signed-off-by: Yuanchun Shen <[email protected]>
…2/1872) Signed-off-by: Yuanchun Shen <[email protected]>
- allow type cast - rewrite call to sql compare to custom ip comapre Signed-off-by: Yuanchun Shen <[email protected]> # Conflicts: # core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java
Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
…1356/1599 | 1476/1915) Signed-off-by: Yuanchun Shen <[email protected]>
…d in mvindex's implementation (1580/2015) Signed-off-by: Yuanchun Shen <[email protected]>
…iting (1579/2015) Signed-off-by: Yuanchun Shen <[email protected]>
…pe hint Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
…e inference (1701/2015) Signed-off-by: Yuanchun Shen <[email protected]>
| + " LogicalTableScan(table=[[scott, EMP]])\n"; | ||
| verifyLogical(root, expectedLogical); | ||
|
|
||
| String expectedSparkSql = | ||
| "SELECT `EMPNO`, `COMM`, COALESCE(`COMM`, `EMPNO`, 'fallback') `result`\n" | ||
| "SELECT `EMPNO`, `COMM`, COALESCE(`COMM`, `EMPNO`) `result`\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to check: is this optimized because EMPNO is nonnull?
the rest optimization seems reasonable.
Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (4)
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q38.yaml (1)
10-11: Physical plan uses SCRIPT-based pushdown with serialized expressions.The physical plan now uses script-based pushdowns containing base64-encoded RexNode expressions for the SAFE_CAST comparisons. While this aligns with the PR's architecture changes, the opaque base64 payloads reduce the readability and maintainability of this expected output file.
Consider documenting (in a comment or companion file) what these serialized scripts represent, or providing a utility to decode them for debugging purposes. This would help future maintainers verify correctness when the expected output changes.
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q39.yaml (1)
13-13: Recommend verifying SCRIPT pushdown correctness and performance.The physical plan now uses SCRIPT-based pushdowns with serialized, base64-encoded operator trees. This raises several concerns:
Maintainability: The encoded script payloads are not human-readable, making debugging pushdown issues difficult.
Performance: OpenSearch script-based filters are typically slower than native filters. Three separate script clauses for SAFE_CAST operations may impact query performance, especially on large datasets.
Correctness: The test expectation validates plan shape but not execution results. Ensure:
- The actual test verifies query results match expected values
- Edge cases (malformatted data, NULL handling) are tested
- Timezone handling in EXPR_TIMESTAMP VARCHAR is correct (note the UTC "Z" suffix in the JSON range query)
Consider:
- Adding integration tests that validate query results, not just plan shape
- Benchmarking query performance compared to the previous FILTER-based approach
- Verifying the script execution produces correct results with representative data
Based on learnings, test SQL generation and optimization paths for Calcite integration changes.
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q30.yaml (1)
1-9: Ensure test coverage for validation pipeline edge cases across the suite.This test file correctly follows the expected YAML format and represents a realistic clickbench query with a high field count (90 fields). As per the coding guidelines for test resources, verify that the broader test suite covers edge cases and boundary conditions for the new Calcite validation pipeline, such as:
- Queries with various field index patterns and ranges
- Type coercion scenarios (UDTs, cross-datetime comparisons, SAFE_CAST)
- Preservation of hints, sort orders, and null directions across RelNode↔SqlNode conversion
- Operator overload resolution (IP comparator, ATAN, ADD ambiguity)
Based on learnings: Test SQL generation and optimization paths for Calcite integration changes.
integ-test/src/test/resources/expectedOutput/calcite/explain_join_with_field_list_max_option.yaml (1)
1-20: Consider additional test cases for edge cases.This test validates the max=1 case with inner join and proper pushdown optimization. To improve test coverage for the max option feature, consider adding test cases for:
- Different max values (e.g., max=2, max=10)
- Join without max option to validate baseline behavior
- Edge case where right side has no matches (empty result)
- Comparing against the SEMI/ANTI join behavior where max has no effect (as noted in learnings)
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (71)
core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javainteg-test/src/test/resources/expectedOutput/calcite/agg_composite_date_range_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_date_histogram_daily.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_terms.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/date_histogram_minute_agg.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/dedup_metrics_size_field.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/multi_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/terms_significant_1.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/terms_significant_2.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_multiple_group_keys.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_null_str.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_timestamp_span_and_category.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_use_other.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_with_integer_span.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q19.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q2.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q30.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q37.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q38.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q39.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q40.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q42.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q43.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q8.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_join4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_sort_on_measure_multi_buckets_not_pushed.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_with_script.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_chart_multi_group_key.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_chart_single_group_key.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_dedup.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push5.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_keepempty_false_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_keepempty_false_pushdown.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_pushdown.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_rename.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_exists_correlated_subquery.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_exists_uncorrelated_subquery.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_agg_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_with_search.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_with_search_call.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_in_correlated_subquery.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_in_uncorrelated_subquery.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_criteria_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_field_list_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_fields_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown5.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown_bucket_nullable1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_multisearch_timestamp.yaml
✅ Files skipped from review due to trivial changes (7)
- integ-test/src/test/resources/expectedOutput/calcite/explain.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_filter_with_search_call.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_chart_single_group_key.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_chart_multi_group_key.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_count_push.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_rename.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown3.yaml
🚧 Files skipped from review as they are similar to previous changes (27)
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_filter_agg_push.yaml
- integ-test/src/test/resources/expectedOutput/calcite/clickbench/q19.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_keepempty_false_push.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex2.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push5.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_filter_with_search.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex3.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_in_uncorrelated_subquery.yaml
- integ-test/src/test/resources/expectedOutput/calcite/big5/date_histogram_minute_agg.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_join_with_fields_max_option.yaml
- integ-test/src/test/resources/expectedOutput/calcite/clickbench/q8.yaml
- integ-test/src/test/resources/expectedOutput/calcite/clickbench/q43.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_pushdown.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_exists_uncorrelated_subquery.yaml
- integ-test/src/test/resources/expectedOutput/calcite/big5/terms_significant_2.yaml
- integ-test/src/test/resources/expectedOutput/calcite/clickbench/q40.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_push.yaml
- integ-test/src/test/resources/expectedOutput/calcite/big5/composite_date_histogram_daily.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown5.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4_alternative.yaml
- integ-test/src/test/resources/expectedOutput/calcite/big5/composite_terms.yaml
- integ-test/src/test/resources/expectedOutput/calcite/big5/composite_terms_keyword.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_join4.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_join_with_criteria_max_option.yaml
🧰 Additional context used
📓 Path-based instructions (4)
integ-test/src/test/resources/**/*
⚙️ CodeRabbit configuration file
integ-test/src/test/resources/**/*: - Verify test data is realistic and representative
- Check data format matches expected schema
- Ensure test data covers edge cases and boundary conditions
Files:
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q38.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex1.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q42.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q39.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_sort_on_measure_multi_buckets_not_pushed.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown_bucket_nullable1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q37.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/multi_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_null_str.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_multisearch_timestamp.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_dedup.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q30.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_keepempty_false_pushdown.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_field_list_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex4.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_use_other.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/dedup_metrics_size_field.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_timestamp_span_and_category.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3.yamlinteg-test/src/test/resources/expectedOutput/calcite/agg_composite_date_range_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_exists_correlated_subquery.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_multiple_group_keys.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/terms_significant_1.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_with_integer_span.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_with_script.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_in_correlated_subquery.yaml
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
core/src/main/java/**/*.java
⚙️ CodeRabbit configuration file
core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit
Files:
core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
🧠 Learnings (6)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q38.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex1.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q42.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q39.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_sort_on_measure_multi_buckets_not_pushed.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown_bucket_nullable1.yamlcore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javainteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q37.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/multi_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_null_str.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_multisearch_timestamp.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_dedup.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q30.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_keepempty_false_pushdown.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_field_list_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex4.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_use_other.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/dedup_metrics_size_field.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_timestamp_span_and_category.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3.yamlinteg-test/src/test/resources/expectedOutput/calcite/agg_composite_date_range_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_exists_correlated_subquery.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_multiple_group_keys.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/terms_significant_1.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_with_integer_span.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_with_script.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_in_correlated_subquery.yaml
📚 Learning: 2025-12-11T05:27:39.856Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_null_str.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_multisearch_timestamp.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_field_list_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_use_other.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/dedup_metrics_size_field.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_timestamp_span_and_category.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_exists_correlated_subquery.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_with_integer_span.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_in_correlated_subquery.yaml
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javainteg-test/src/test/resources/expectedOutput/calcite/chart_null_str.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex4.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_use_other.yamlinteg-test/src/test/resources/expectedOutput/calcite/chart_multiple_group_keys.yaml
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: In opensearch-project/sql, when creating custom Calcite RelNode classes that extend EnumerableLimitSort or other Calcite RelNode types, always override the `copy` method. Without overriding copy, the class will downgrade to its parent class type during copy operations, losing the custom implementation.
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/explain_exists_correlated_subquery.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_in_correlated_subquery.yaml
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/explain_exists_correlated_subquery.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_in_correlated_subquery.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (25, integration)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: security-it-linux (25)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: security-it-linux (21)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: CodeQL-Scan (java)
- GitHub Check: test-sql-cli-integration (21)
integ-test/src/test/resources/expectedOutput/calcite/explain_complex_dedup.yaml
Outdated
Show resolved
Hide resolved
integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr.yaml
Show resolved
Hide resolved
integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3.yaml
Outdated
Show resolved
Hide resolved
penghuo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
| * @return {@code true} if the exception should be tolerated as a known Calcite bug, {@code false} | ||
| * otherwise | ||
| */ | ||
| public static boolean tolerantValidationException(Exception e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How many cases will hit this? It will use the original plan if it's tolerant, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Roughly 5 cases. Yes, it will use the original plan if hit
| private RelNode validate(RelNode relNode, CalcitePlanContext context) { | ||
| SkipRelValidationShuttle skipShuttle = new SkipRelValidationShuttle(); | ||
| relNode.accept(skipShuttle); | ||
| if (skipShuttle.shouldSkipValidation()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a limitation? How many cases in our IT will hit this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, they are limitations, either from our implementation or calite's. Around a dozen ITs will hit this.
| import org.apache.calcite.sql2rel.SqlToRelConverter; | ||
| import org.checkerframework.checker.nullness.qual.Nullable; | ||
|
|
||
| public class PplSqlToRelConverter extends SqlToRelConverter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we've extended Calcite's SqlToRelConverter, we have chance to customized our plan by using hints.
e.g. When converting RelNode to SqlNode, we convert LogicalDedup to SqlSelect with dedup hints. And later when converting SqlNode to RelNode, we can detect that hint and add LogicalDedup on the original LogicalProject.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I was also considering this approach. I hesitated because I was not sure which sql operation conforms with dedup's attributes. SqlSelect should to be a reasonable one.
SQL hints are already used to the empty bucket hints in the current implementation.
qianheng-aws
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious how much overhead will this validation process introduce?
|
This PR validate the whole SQL including type compatible, SQL grammer, field access and others. However what we need is only type compatibility since the relbuilder will not build an incorrect plan. I'm thinking if we can only convert recall to sqlnode and invoke validateCall to do that part of validation(should include type coercion), and then convert it back to recall. |
It's kind of tricky, the validation of these are deeply entangled. E.g. if a function call is at a very bottom level of the AST, the validator has to go down the AST and validate all the nodes it comes across to build a proper validation scope for the function call. If we want to try to build them for these function calls, we may need to replicate those steps, which does not differ much from the original validation. |
Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/user/ppl/interfaces/endpoint.md (1)
56-64: Fix duplicate "Description" section and add documentation for SAFE_CAST behavior in explain output.The Explain section has a duplicate "### Description" heading at lines 58 and 61 that should be consolidated into a single introduction.
Additionally, while the examples show
SAFE_CASTin the logical plan output (e.g.,SAFE_CAST(30:BIGINT)), there is no explanation of what SAFE_CAST is, how it differs from CAST, or why it appears in the output. For users upgrading from previous versions and any automated tooling that parses explain output, clarification would be valuable:
- Explain that SAFE_CAST is used in logical plans for type-safe casting with malformed string handling
- Note that the physical plan shows the optimized form without SAFE_CAST (as seen in examples where logical shows
SAFE_CAST(30:BIGINT)but physical shows>($5, 30))- Consider a brief note that this represents a change in output format for users parsing explain responses
🤖 Fix all issues with AI agents
In @core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java:
- Around line 129-168: The getValidator method in CalcitePlanContext currently
calls connection.createStatement() without ensuring the Statement is closed;
change the try/catch that obtains CalciteServerStatement to a try-with-resources
that declares a Statement (e.g., try (Statement stmt =
connection.createStatement()) { CalciteServerStatement statement =
stmt.unwrap(CalciteServerStatement.class); ... } ), move the validator creation
into that try block so the Statement is closed automatically, and remove the old
catch/throw pattern while preserving the existing IllegalStateException and
validator initialization logic.
🧹 Nitpick comments (5)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLBasicIT.java (1)
458-465: AddverifySchema()for consistency with other BETWEEN tests.All other BETWEEN tests in this file (e.g.,
testBetween,testBetweenWithDifferentTypes) verify the schema before checking data rows. This test should follow the same pattern for consistency.♻️ Proposed fix
@Test public void testBetweenWithMixedTypes() throws IOException { JSONObject actual = executeQuery( String.format( "source=%s | where age between '35' and 38 | fields firstname, age", TEST_INDEX_BANK)); + verifySchema(actual, schema("firstname", "string"), schema("age", "int")); verifyDataRows(actual, rows("Hattie", 36), rows("Elinor", 36)); }docs/user/ppl/interfaces/endpoint.md (1)
122-124: Clarify the different constant representations across plan formats.The same literal constant
30appears in three different forms across the extended format output:
- Logical plan:
SAFE_CAST(30:BIGINT)(line 122)- Physical plan:
30:BIGINT(line 123)- Extended Java code:
30L(line 124)While these representations serve different purposes (logical abstraction, physical optimization, runtime code), users may find the variations confusing without explanation.
Consider adding a brief explanation in the documentation about how the same operation is represented differently across plan stages, or reference Calcite's documentation on plan representations.
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
1810-1818: Note the TODO comment regarding pushdown behavior.The TODO on line 1810 indicates that the chart with
useother=truescenario currently doesn't achieve pushdown as expected. This is flagged for future investigation.Consider creating a tracking issue for this if one doesn't already exist, to ensure it's addressed in a follow-up.
Would you like me to help create a GitHub issue to track this pushdown optimization opportunity?
core/src/main/java/org/opensearch/sql/executor/QueryService.java (1)
283-341: Well-structured validation pipeline with comprehensive JavaDoc.The
validatemethod correctly implements the RelNode → SqlNode → validate → RelNode round-trip as per PR objectives. A few observations:
Line 311:
Objects.requireNonNull(rewritten)will throw NPE ifSqlRewriteShuttlereturns null. Consider whether a more descriptive exception would aid debugging.Line 314: Silent fallback for tolerant validation exceptions may make debugging difficult. Consider adding debug-level logging when falling back.
The method is at the ~50-line threshold per coding guidelines. The current organization is logical, but if it grows further, consider extracting the
SqlToRelConverterconfiguration into a helper.💡 Optional: Add debug logging for validation fallback
} catch (CalciteContextException e) { if (ValidationUtils.tolerantValidationException(e)) { + log.debug("Tolerant validation fallback triggered: {}", e.getMessage()); return relNode; } throw new ExpressionEvaluationException(e.getMessage(), e); }core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java (1)
84-93: Static operatorTableProvider requires careful initialization ordering.The static
operatorTableProvidermust be set before anyCalcitePlanContextcallsgetValidator(). TheIllegalStateExceptionat line 141-142 guards against this, but consider documenting the initialization requirement in the class JavaDoc.📝 Optional: Add class-level documentation for initialization order
/** * ...existing class doc... * * <p><b>Initialization:</b> {@link #setOperatorTableProvider(SqlOperatorTableProvider)} * must be called during module initialization before any context creates a validator. */ public class CalcitePlanContext {
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (76)
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.javacore/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javacore/src/main/java/org/opensearch/sql/executor/QueryService.javadocs/user/ppl/interfaces/endpoint.mdinteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLBasicIT.javainteg-test/src/test/resources/expectedOutput/calcite/access_struct_subfield_with_item.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_date_histogram_daily.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_terms.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/date_histogram_minute_agg.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/dedup_metrics_size_field.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/keyword_in_range.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/multi_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/sort_keyword_can_match_shortcut.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/sort_keyword_no_can_match_shortcut.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/sort_numeric_asc_with_match.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/sort_numeric_desc_with_match.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q11.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q12.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q13.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q14.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q15.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q22.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q23.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q31.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q32.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q37.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q38.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q39.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q42.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q43.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q8.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_bin_aligntime.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_bin_span.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr2_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr_complex1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr_complex2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_keepempty_false_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_eval_max.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_eval_min.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_fillnull_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_fillnull_value_syntax.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_push_compare_date_string.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_push_compare_time_string.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_push_compare_timestamp_string.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_with_search.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_criteria_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_fields_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_push.yaml
✅ Files skipped from review due to trivial changes (1)
- integ-test/src/test/resources/expectedOutput/calcite/clickbench/q41_alternative.yaml
🚧 Files skipped from review as they are similar to previous changes (12)
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.java
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4_alternative.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_fillnull_value_syntax.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_filter_push_compare_time_string.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex2.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_fillnull_push.yaml
- integ-test/src/test/resources/expectedOutput/calcite/big5/sort_keyword_no_can_match_shortcut.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_eval_max.yaml
- core/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.java
- integ-test/src/test/resources/expectedOutput/calcite/access_struct_subfield_with_item.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3_alternative.yaml
- integ-test/src/test/resources/expectedOutput/calcite/big5/composite_date_histogram_daily.yaml
🧰 Additional context used
📓 Path-based instructions (8)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLBasicIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javacore/src/main/java/org/opensearch/sql/executor/QueryService.javacore/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLBasicIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javacore/src/main/java/org/opensearch/sql/executor/QueryService.javacore/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLBasicIT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources
- Verify test data files exist in integ-test/src/test/resources/
- Check test assertions are meaningful and specific
- Validate tests clean up resources after execution
- Ensure tests are independent and can run in any order
- Flag tests that reference non-existent indices (e.g., EMP)
- Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLBasicIT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLBasicIT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLBasicIT.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLBasicIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javacore/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
integ-test/src/test/resources/**/*
⚙️ CodeRabbit configuration file
integ-test/src/test/resources/**/*: - Verify test data is realistic and representative
- Check data format matches expected schema
- Ensure test data covers edge cases and boundary conditions
Files:
integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_keepempty_false_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_eval_min.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr2.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_terms.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr_complex2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_criteria_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex1.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q32.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr_complex1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_bin_span.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q31.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_fields_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr2_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/keyword_in_range.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q37.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/date_histogram_minute_agg.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q38.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr1.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/sort_numeric_asc_with_match.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q23.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_with_search.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q11.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q8.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q42.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/dedup_metrics_size_field.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q14.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q22.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr2.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q39.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q15.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q12.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/sort_keyword_can_match_shortcut.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q13.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/multi_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_bin_aligntime.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_push_compare_timestamp_string.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/sort_numeric_desc_with_match.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_push_compare_date_string.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q43.yaml
core/src/main/java/**/*.java
⚙️ CodeRabbit configuration file
core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit
Files:
core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javacore/src/main/java/org/opensearch/sql/executor/QueryService.javacore/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
⚙️ CodeRabbit configuration file
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java: - Flag methods >50 lines - this file is known to be hard to read
- Suggest extracting complex logic into helper methods
- Check for code organization and logical grouping
- Validate all RelNode types are handled
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🧠 Learnings (9)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_keepempty_false_push.yamlcore/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javainteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_eval_min.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr2.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_terms.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr_complex2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_criteria_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex1.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q32.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr_complex1.yamlinteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLBasicIT.javainteg-test/src/test/resources/expectedOutput/calcite/explain_bin_span.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q31.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex3.yamlcore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javainteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_fields_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr2_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr1.yamlcore/src/main/java/org/opensearch/sql/executor/QueryService.javainteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/keyword_in_range.yamlcore/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.javainteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q37.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/date_histogram_minute_agg.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q38.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr1.yamldocs/user/ppl/interfaces/endpoint.mdinteg-test/src/test/resources/expectedOutput/calcite/big5/sort_numeric_asc_with_match.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q23.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_with_search.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q11.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q8.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q42.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/dedup_metrics_size_field.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q14.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q22.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr2.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q39.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q15.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/composite_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q12.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/sort_keyword_can_match_shortcut.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q13.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/multi_terms_keyword.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_bin_aligntime.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_push_compare_timestamp_string.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/sort_numeric_desc_with_match.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_push_compare_date_string.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q43.yaml
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLBasicIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javacore/src/main/java/org/opensearch/sql/executor/QueryService.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLBasicIT.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLBasicIT.javacore/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-29T05:32:11.893Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:11.893Z
Learning: In opensearch-project/sql, when creating custom Calcite RelNode classes that extend EnumerableLimitSort or other Calcite RelNode types, always override the `copy` method. Without overriding copy, the class will downgrade to its parent class type during copy operations, losing the custom implementation.
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_criteria_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex3.yamlcore/src/main/java/org/opensearch/sql/executor/QueryService.javainteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr1.yamldocs/user/ppl/interfaces/endpoint.mdinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3.yaml
📚 Learning: 2025-12-11T05:27:39.856Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_criteria_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_fields_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4.yamldocs/user/ppl/interfaces/endpoint.mdinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative.yamlcore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javainteg-test/src/test/resources/expectedOutput/calcite/explain_bin_aligntime.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_push_compare_date_string.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yaml
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
core/src/main/java/org/opensearch/sql/executor/QueryService.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Development requires JDK 21 for the OpenSearch SQL project
Applied to files:
core/src/main/java/org/opensearch/sql/executor/QueryService.javacore/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java
🧬 Code graph analysis (2)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLBasicIT.java (1)
integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java (1)
MatcherUtils(43-502)
core/src/main/java/org/opensearch/sql/executor/QueryService.java (9)
core/src/main/java/org/opensearch/sql/calcite/validate/OpenSearchSparkSqlDialect.java (1)
OpenSearchSparkSqlDialect(28-113)core/src/main/java/org/opensearch/sql/calcite/validate/converters/PplRelToSqlNodeConverter.java (1)
PplRelToSqlNodeConverter(36-172)core/src/main/java/org/opensearch/sql/calcite/validate/converters/PplSqlToRelConverter.java (1)
PplSqlToRelConverter(24-92)core/src/main/java/org/opensearch/sql/calcite/validate/shuttles/PplRelToSqlRelShuttle.java (1)
PplRelToSqlRelShuttle(28-83)core/src/main/java/org/opensearch/sql/calcite/validate/shuttles/SkipRelValidationShuttle.java (1)
SkipRelValidationShuttle(85-175)core/src/main/java/org/opensearch/sql/calcite/validate/shuttles/SqlRewriteShuttle.java (1)
SqlRewriteShuttle(21-69)common/src/main/java/org/opensearch/sql/common/utils/QueryContext.java (1)
QueryContext(18-87)core/src/main/java/org/opensearch/sql/exception/CalciteUnsupportedException.java (1)
CalciteUnsupportedException(8-17)core/src/main/java/org/opensearch/sql/exception/ExpressionEvaluationException.java (1)
ExpressionEvaluationException(9-18)
| public SqlValidator getValidator() { | ||
| if (validator == null) { | ||
| synchronized (this) { | ||
| // Double-Checked Locking for thread-safety | ||
| if (validator == null) { | ||
| final CalciteServerStatement statement; | ||
| try { | ||
| statement = connection.createStatement().unwrap(CalciteServerStatement.class); | ||
| } catch (SQLException e) { | ||
| throw new RuntimeException(e); | ||
| } | ||
| if (operatorTableProvider == null) { | ||
| throw new IllegalStateException( | ||
| "SqlOperatorTableProvider must be set before creating CalcitePlanContext"); | ||
| } | ||
| SqlValidator.Config validatorConfig = | ||
| SqlValidator.Config.DEFAULT | ||
| .withTypeCoercionRules(PplTypeCoercionRule.instance()) | ||
| .withTypeCoercionFactory(PplTypeCoercion::create) | ||
| // Use lenient conformance for PPL compatibility | ||
| .withConformance(OpenSearchSparkSqlDialect.DEFAULT.getConformance()) | ||
| // Use Spark SQL's NULL collation (NULLs sorted LOW/FIRST) | ||
| .withDefaultNullCollation(NullCollation.LOW) | ||
| // This ensures that coerced arguments are replaced with cast version in sql | ||
| // select list because coercion is performed during select list expansion during | ||
| // sql validation. Affects 4356.yml | ||
| // See SqlValidatorImpl#validateSelectList and AggConverter#translateAgg | ||
| .withIdentifierExpansion(true); | ||
| validator = | ||
| PplValidator.create( | ||
| statement, | ||
| config, | ||
| operatorTableProvider.getOperatorTable(), | ||
| TYPE_FACTORY, | ||
| validatorConfig); | ||
| } | ||
| } | ||
| } | ||
| return validator; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
rg -n "CalciteServerStatement" --type java -B 2 -A 2Repository: opensearch-project/sql
Length of output: 5211
🏁 Script executed:
# Check if CalciteServerStatement is defined in this codebase or imported from Calcite
rg -n "class CalciteServerStatement" --type javaRepository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
# Check for try-with-resources usage with statements in similar code
rg -n "createStatement()" --type java -B 2 -A 2 | head -50Repository: opensearch-project/sql
Length of output: 4675
🏁 Script executed:
# Look at the imports and see where CalciteServerStatement comes from
head -50 core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.javaRepository: opensearch-project/sql
Length of output: 2005
🏁 Script executed:
# Search for close() calls on statement variables
rg -n "statement\.close()" --type javaRepository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
# Check CalciteToolsHelper usage after statement creation
sed -n '160,180p' core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javaRepository: opensearch-project/sql
Length of output: 891
🏁 Script executed:
# Check how validator is used after it's created - is it long-lived?
sed -n '129,200p' core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.javaRepository: opensearch-project/sql
Length of output: 2766
🏁 Script executed:
# Look at full method context of CalciteToolsHelper where statement is created
sed -n '140,180p' core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javaRepository: opensearch-project/sql
Length of output: 1652
Wrap connection.createStatement() in try-with-resources for proper resource cleanup.
The CalciteServerStatement created on line 136 is never explicitly closed. While this pattern is used consistently elsewhere in the codebase (e.g., CalciteToolsHelper.java:168), the coding guidelines require using try-with-resources for proper resource cleanup. Either use try-with-resources or document why this Calcite framework object does not require explicit closing.
Example fix:
final CalciteServerStatement statement;
try (Statement stmt = connection.createStatement()) {
statement = stmt.unwrap(CalciteServerStatement.class);
// rest of initialization
}🤖 Prompt for AI Agents
In @core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java around
lines 129 - 168, The getValidator method in CalcitePlanContext currently calls
connection.createStatement() without ensuring the Statement is closed; change
the try/catch that obtains CalciteServerStatement to a try-with-resources that
declares a Statement (e.g., try (Statement stmt = connection.createStatement())
{ CalciteServerStatement statement = stmt.unwrap(CalciteServerStatement.class);
... } ), move the validator creation into that try block so the Statement is
closed automatically, and remove the old catch/throw pattern while preserving
the existing IllegalStateException and validator initialization logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/user/ppl/functions/collection.md (1)
165-201: Two TRANSFORM examples are appropriately disabled pending issue #4972, but note that MVMAP has the same pattern with three similarly disabled examples.The TRANSFORM function examples at lines 167–172 and 185–190 are marked with
ppl ignoreand reference issue #4972. This is a consistent pattern—other lambda-based functions like FORALL, EXISTS, FILTER, and REDUCE retain working examples, suggesting the issue specifically affects TRANSFORM and MVMAP transformations. Verify that issue #4972 adequately captures all affected examples (TRANSFORM: 2 examples; MVMAP: 3 examples at lines 819–824, 837–842, 859–864) to ensure comprehensive tracking.
🧹 Nitpick comments (1)
docs/user/ppl/functions/collection.md (1)
818-875: MVMAP examples disabled with same pattern as TRANSFORM.Three MVMAP examples are disabled, consistent with the TRANSFORM failures. Both functions involve array element transformations (lambdas for TRANSFORM, expressions for MVMAP).
Please verify that issue #4972 explicitly covers both TRANSFORM and MVMAP failures, as the root cause may be related to how the new Calcite validation handles array element-level operations.
Consider consolidating the TODO comments or creating a more descriptive issue title that clearly indicates "lambda and array transformation validation" to help future maintainers understand the scope.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
docs/user/ppl/cmd/bin.mddocs/user/ppl/cmd/patterns.mddocs/user/ppl/functions/collection.mddocs/user/ppl/functions/conversion.md
🚧 Files skipped from review as they are similar to previous changes (2)
- docs/user/ppl/cmd/patterns.md
- docs/user/ppl/functions/conversion.md
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (25, integration)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: build-linux (25, unit)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (25, doc)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: security-it-linux (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: CodeQL-Scan (java)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: test-sql-cli-integration (21)
🔇 Additional comments (1)
docs/user/ppl/cmd/bin.md (1)
501-508: The example correctly uses theppl ignoredirective, which is the documented approach for marking code blocks that should be skipped during testing. This pattern is consistently used throughout the codebase (inlimitations.md,functions/collection.md, and others) to handle incomplete or broken examples—paired with HTML TODO comments explaining the issue. No changes are needed; the implementation follows established conventions.Likely an incorrect or invalid review comment.
Signed-off-by: Yuanchun Shen <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java (1)
1014-1017: Test assertion doesn't match documented intent.The test comment (lines 1004-1008) states it should verify "the error message clearly explains the limitation and suggests solutions," but the assertion now checks for an internal Calcite error (
resolving method 'minus[...]') that provides no user-friendly guidance.The TODO referencing issue #4973 acknowledges this is a temporary state. When addressing that issue, consider updating the error handling to surface a user-friendly message explaining that
binson timestamp fields requires pushdown to be enabled, rather than exposing internal Calcite method resolution failures.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
docs/user/ppl/cmd/bin.mddocs/user/ppl/cmd/patterns.mddocs/user/ppl/functions/collection.mddocs/user/ppl/functions/conversion.mdinteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.javainteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41_alternative.yaml
🚧 Files skipped from review as they are similar to previous changes (5)
- docs/user/ppl/cmd/bin.md
- docs/user/ppl/cmd/patterns.md
- integ-test/src/test/resources/expectedOutput/calcite/clickbench/q41_alternative.yaml
- docs/user/ppl/functions/conversion.md
- docs/user/ppl/functions/collection.md
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources
- Verify test data files exist in integ-test/src/test/resources/
- Check test assertions are meaningful and specific
- Validate tests clean up resources after execution
- Ensure tests are independent and can run in any order
- Flag tests that reference non-existent indices (e.g., EMP)
- Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: security-it-linux (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (25, integration)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (21, doc)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: CodeQL-Scan (java)
- GitHub Check: test-sql-cli-integration (21)
Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java (1)
1004-1017: Test assertion doesn't match stated verification intent.The comment states "Verify the error message clearly explains the limitation and suggests solutions" and the assertion message expects "clear error message about bins parameter requirements," but the actual check validates an internal Calcite error (
resolving method 'minus[class java.lang.String, class java.lang.String]'). This cryptic message won't help users understand that they need to enable pushdown for timestamp binning.The TODO referencing #4973 acknowledges this, but consider updating the assertion message (lines 1011-1013) to reflect what's actually being checked, so the test failure message is accurate if the internal error changes:
assertTrue( - "Expected clear error message about bins parameter requirements on timestamp fields, but" - + " got: " + "Expected Calcite method resolution error (tracked by #4973 for proper user message), but" + + " got: " + errorMessage,
🤖 Fix all issues with AI agents
In @docs/user/ppl/cmd/bin.md:
- Around line 501-502: Example 20 in the docs contains a non-functional ppl code
block marked only with an HTML TODO comment; either remove the entire Example 20
block until issue #4973 is resolved or make the limitation obvious to users by
replacing the HTML comment with a visible admonition (warning note) immediately
before the ```ppl ignore``` block that states the feature is not supported and
references issue #4973; ensure the visible note uses the same documentation
styling (admonition/warning) as other docs so users see it when rendered.
🧹 Nitpick comments (5)
integ-test/src/test/resources/expectedOutput/calcite/big5/range_field_conjunction_big_range_big_term_query.yaml (1)
1-1: Consider removing the leading blank line.YAML files typically don't start with a blank line. While this won't affect parsing, it's inconsistent with standard YAML formatting conventions.
Suggested fix
- calcite:integ-test/src/test/resources/expectedOutput/calcite/clickbench/q41_alternative.yaml (1)
12-15: Physical plan demonstrates proper pushdown optimization.The expected physical plan correctly shows:
- Dual-level
EnumerableLimitfor system limit (10000) and user pagination (offset 100, fetch 10)- Aggregation pushed to OpenSearch with
multi_termssized at 110 to satisfyoffset + fetch- Boolean query structure combining term, range, script, and exists filters
One observation: the embedded base64-encoded scripts make it difficult to verify the serialized expressions are correct if they need future updates. Consider adding a comment in the test or documentation explaining how to decode/encode these blocks for maintenance purposes.
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yaml (1)
1-1: Minor: Leading empty line in YAML file.The file begins with an empty line. While this doesn't affect parsing, it's unconventional for YAML files. Consider removing for consistency with other expected output files, or ignore if this is a generated artifact.
integ-test/src/test/resources/expectedOutput/calcite/explain_output.yaml (1)
1-1: Unnecessary leading empty line.The empty line at the start of the file is not needed and could be removed for cleaner formatting.
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yaml (1)
1-1: Minor: Spurious blank line at file start.An empty line at the beginning of the YAML file is unnecessary and could be removed for cleaner formatting, though it won't affect functionality.
🧹 Suggested fix
- calcite: logical: |
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (45)
core/src/main/java/org/opensearch/sql/calcite/validate/PplTypeCoercion.javadocs/user/ppl/cmd/bin.mddocs/user/ppl/cmd/patterns.mddocs/user/ppl/functions/collection.mddocs/user/ppl/functions/conversion.mddocs/user/ppl/interfaces/endpoint.mdinteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.javainteg-test/src/test/resources/expectedOutput/calcite/big5/dedup_metrics_size_field.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/range_field_conjunction_big_range_big_term_query.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/range_field_conjunction_small_range_big_term_query.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/range_field_conjunction_small_range_small_term_query.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/range_field_disjunction_big_range_small_term_query.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/range_numeric.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_having1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_having2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_having3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_keepempty_false_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_keepempty_true_not_pushed.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_text_type_no_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_agg_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_function_script_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_script_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_filter_then_limit_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_criteria_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_join_with_fields_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_10_filter_5_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_then_filter_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_multisearch_basic.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_dedup_not_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_output.yaml
✅ Files skipped from review due to trivial changes (3)
- integ-test/src/test/resources/expectedOutput/calcite/explain_filter_agg_push.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_dedup_not_push.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_filter.yaml
🚧 Files skipped from review as they are similar to previous changes (22)
- integ-test/src/test/resources/expectedOutput/calcite/explain_join_with_fields_max_option.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_keepempty_true_not_pushed.yaml
- docs/user/ppl/cmd/patterns.md
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_keepempty_false_push.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_filter_function_script_push.yaml
- docs/user/ppl/functions/conversion.md
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr1.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_having1.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_having2.yaml
- docs/user/ppl/interfaces/endpoint.md
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr1.yaml
- core/src/main/java/org/opensearch/sql/calcite/validate/PplTypeCoercion.java
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex1.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_text_type_no_push.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_filter_script_push.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_filter_then_limit_push.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_limit_10_filter_5_push.yaml
- integ-test/src/test/resources/expectedOutput/calcite/big5/range_field_conjunction_small_range_big_term_query.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_push.yaml
- integ-test/src/test/resources/expectedOutput/calcite/explain_join_with_criteria_max_option.yaml
- integ-test/src/test/resources/expectedOutput/calcite/big5/range_numeric.yaml
🧰 Additional context used
📓 Path-based instructions (6)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources
- Verify test data files exist in integ-test/src/test/resources/
- Check test assertions are meaningful and specific
- Validate tests clean up resources after execution
- Ensure tests are independent and can run in any order
- Flag tests that reference non-existent indices (e.g., EMP)
- Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
integ-test/src/test/resources/**/*
⚙️ CodeRabbit configuration file
integ-test/src/test/resources/**/*: - Verify test data is realistic and representative
- Check data format matches expected schema
- Ensure test data covers edge cases and boundary conditions
Files:
integ-test/src/test/resources/expectedOutput/calcite/explain_filter_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_then_filter_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/range_field_conjunction_big_range_big_term_query.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_multisearch_basic.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/range_field_disjunction_big_range_small_term_query.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_output.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/range_field_conjunction_small_range_small_term_query.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_having3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/dedup_metrics_size_field.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3.yaml
🧠 Learnings (7)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.javainteg-test/src/test/resources/expectedOutput/calcite/explain_filter_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_limit_then_filter_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/range_field_conjunction_big_range_big_term_query.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q41_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_multisearch_basic.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/range_field_disjunction_big_range_small_term_query.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_output.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/range_field_conjunction_small_range_small_term_query.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_having3.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative.yamlinteg-test/src/test/resources/expectedOutput/calcite/big5/dedup_metrics_size_field.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3.yaml
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteBinCommandIT.java
📚 Learning: 2025-12-11T05:27:39.856Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/explain_filter_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_multisearch_basic.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_output.yamlinteg-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_having3.yaml
📚 Learning: 2025-12-29T05:32:11.893Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:11.893Z
Learning: In opensearch-project/sql, when creating custom Calcite RelNode classes that extend EnumerableLimitSort or other Calcite RelNode types, always override the `copy` method. Without overriding copy, the class will downgrade to its parent class type during copy operations, losing the custom implementation.
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
- GitHub Check: security-it-linux (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (21, integration)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: test-sql-cli-integration (21)
- GitHub Check: CodeQL-Scan (java)
- GitHub Check: WhiteSource Security Check
🔇 Additional comments (31)
integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yaml (2)
7-7: LGTM! ExplicitNULLS FIRSTaligns with PR objectives.The addition of
NULLS FIRSTto the window function correctly reflects the PR's goal of preserving null direction during RelNode ↔ SqlNode round trips. This makes the null ordering behavior explicit in the explain plan output, which is consistent with Calcite's validation system behavior.The test data is realistic and representative of the expected Calcite explain output format.
13-13: Physical plan update is consistent with logical plan changes.The physical plan correctly mirrors the logical plan's null ordering semantics. The OpenSearch aggregation query DSL with
"order":[{"_count":"desc"},{"_key":"asc"}]is the appropriate translation for the rare/top operation with the preserved null direction.integ-test/src/test/resources/expectedOutput/calcite/explain_multisearch_basic.yaml (1)
1-23: LGTM! Type annotation simplification aligns with Calcite validation changes.The removal of explicit
:VARCHARannotations from string literals ('young','adult') is consistent with the PR's migration to Calcite's type validation system. The plan semantics remain correct:
- Boundary conditions are properly handled (age < 30 for 'young', age >= 30 for 'adult' with no overlap).
- Filter pushdown to OpenSearch index scans is preserved.
- Aggregation and projection logic is intact.
integ-test/src/test/resources/expectedOutput/calcite/explain_limit_then_filter_push.yaml (2)
7-8: LGTM - LogicalProject addition is expected from SqlNode round-trip.The explicit
LogicalProjectwrapping the index scan is a consequence of the RelNode → SqlNode → RelNode validation round-trip introduced by this PR. The physical plan shows this doesn't impact optimization effectiveness—the push-down still correctly projects only[age]and appliesLIMIT->5at the scan level.
11-11: LGTM - Explicit BIGINT typing reflects Calcite's type coercion.The change from
30to30:BIGINTis the expected result of Calcite's validation system adding explicit type information to literals during coercion. This ensures type-safe comparison with theagefield.integ-test/src/test/resources/expectedOutput/calcite/big5/range_field_conjunction_big_range_big_term_query.yaml (1)
3-10: LGTM!The test data is well-structured and covers the expected behavior for the Calcite integration:
- The logical plan correctly represents the query structure with explicit type-safe range bounds (
>=($28, 1), <=($28, 100)).- The physical plan properly reflects the typed Sarg representation (
Sarg[[1L:BIGINT..100L:BIGINT]]:BIGINT), aligning with the PR's type coercion objectives.- The OpenSearch query builder correctly translates to a boolean query with term and range clauses, including boundary values.
- Edge cases are covered with inclusive range boundaries.
integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3_alternative.yaml (1)
1-11: Test data structure looks correct for dedup expression scenario.The expected output covers:
- Logical plan: Properly demonstrates dedup via
ROW_NUMBER() OVER (PARTITION BY new_gender, new_state)with filter<= 2to keep top 2 rows per partition.- Physical plan: Shows aggregation pushdown to OpenSearch with composite buckets on the partition keys and
top_hitsto retrieve actual documents.A few observations:
- The physical plan on line 11 is extremely long (single line with Base64-encoded scripts), which is typical for these files but makes diff reviews difficult.
- The Base64 payloads encode
LOWERfunction metadata forgender.keywordandstate.keywordfields—consistent with theLOWER($4)andLOWER($7)expressions in the logical plan.Given the PR objectives mention "Investigate dedup optimization failure" as remaining work, this alternative expected output file likely represents an acceptable plan shape during the investigation. Please confirm this test case aligns with the expected behavior after the Calcite validation/coercion changes.
integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3.yaml (1)
1-11: LGTM - Expected output correctly captures dedup-with-expressions pushdown.The test data is realistic and representative:
- Uses typical account index fields (account_number, firstname, balance, gender, state, etc.)
- Validates the ROW_NUMBER-based dedup logic with
keepempty=2(condition<=($19, 2))- Physical plan correctly shows composite aggregation pushdown with LOWER function scripts for
new_genderandnew_statetop_hitssize of 2 andmissing_bucket=falseproperly align with the logical plan's dedup semantics and IS NOT NULL filtersThe base64-encoded script content is expected for Calcite's serialized script handling.
integ-test/src/test/resources/expectedOutput/calcite/big5/range_field_disjunction_big_range_small_term_query.yaml (2)
1-8: LGTM! Logical plan correctly represents the disjunction query.The expected output properly captures the OR condition between a term match and a numeric range. The plan structure (SystemLimit → Sort → Project → Filter → IndexScan) is appropriate for this query pattern.
Minor: Line 1 adds an empty line at the start of the file, which is a minor formatting inconsistency but doesn't affect functionality.
9-10: LGTM! Physical plan correctly pushes down the disjunction to OpenSearch.The pushdown context properly translates the OR condition:
- Calcite's
Sarg[[1L:BIGINT..100L:BIGINT]]representation for the range is appropriate- The OpenSearch DSL correctly uses
bool.shouldto handle the disjunction with bothtermandrangeclauses- Predicate pushdown is successfully demonstrated, which aligns with the PR objectives for Calcite validation integration
This test case effectively covers the edge case of mixed query types (term + range) in a disjunction.
docs/user/ppl/functions/collection.md (2)
165-201: LGTM! Properly documented known issue.The TRANSFORM examples are consistently marked with TODO comments and
ppl ignoreblocks, referencing issue #4972 for tracking. The expected outputs are preserved, which will facilitate re-enabling these examples once the issue is resolved.
818-875: LGTM! Consistent handling of MVMAP examples.All three MVMAP examples are uniformly marked with TODO comments and
ppl ignoreblocks, referencing the same issue #4972. The approach is consistent with the TRANSFORM examples and preserves all expected outputs for future validation.integ-test/src/test/resources/expectedOutput/calcite/big5/range_field_conjunction_small_range_small_term_query.yaml (1)
1-10: LGTM! Test data is realistic and covers expected edge cases.The expected output correctly validates the Calcite validation/coercion pipeline:
- The logical plan reflects the Sort → Project → Filter ordering consistent with PR objectives.
- The filter predicate uses compound range semantics (
AND(>=($28, 10), <=($28, 20))) and OR disjunction.- Field indices in the physical plan (
$13→metrics.size,$15→aws.cloudwatch.log_stream) correctly map to the projected fields.- The
Sarg[[10L:BIGINT..20L:BIGINT]]:BIGINTnotation aligns with explicit type-casting introduced by the validation system.The numeric bounds appearing as
10.0/20.0(floats) in the ES query vs10L:BIGINTin the Sarg is expected behavior for OpenSearch compatibility.integ-test/src/test/resources/expectedOutput/calcite/explain_filter_push.yaml (2)
7-11: Redundant LogicalProject nodes appear in the plan.Lines 7 and 9 contain identical
LogicalProjectnodes projecting all 17 fields. This redundancy likely results from the RelNode → SqlNode → RelNode round-trip introduced by the Calcite validation phase. While functionally correct, this may indicate a missed optimization opportunity.Given the PR TODOs mention "Investigate dedup optimization failure," this is expected behavior for now and should be addressed as part of that follow-up.
Confirm the redundant projections don't impact query performance in production and are handled by downstream optimization passes.
13-13: BIGINT-typed Sarg representation is correct and consistent.The physical plan correctly uses
Sarg[(30L:BIGINT..40L:BIGINT)]:BIGINTfor the age range filter, and the JSONsourceBuildercorrectly translates this to:
"include_lower":false, "include_upper":falsematching the exclusive bounds(30..40)- Balance filter
>($0, 10000)correctly maps to"include_lower":falseThe field indexing after
PROJECT->[balance, age]is also correct:$0= balance,$1= age.integ-test/src/test/resources/expectedOutput/calcite/explain_agg_paginating_having3.yaml (1)
4-13: LGTM! Type coercion changes correctly reflect Calcite validation behavior.The updated expected output properly captures the new Calcite-based type coercion:
Logical plan (line 5): The explicit
CAST(1000):DOUBLE NOT NULLfor comparing againstnew_avg(derived fromAVG, which returnsDOUBLE) is the expected implicit cast insertion from Calcite's type coercion.Physical plan (line 12): Typed literals (
1000.0E0:DOUBLE,1:BIGINT) correctly match the operand types of the comparisons.Consistency: The second comparison
>($4, 1)doesn't require a cast in the logical plan sinceCOUNT()returnsBIGINTand the literal1is compatible (typed as1:BIGINTin the physical plan).Test data is realistic, covers the type coercion edge case for numeric comparisons, and aligns with the PR's objective of migrating to Calcite's validation system.
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q41_alternative.yaml (1)
1-11: Logical plan structure looks correct.The expected logical plan covers realistic analytics query patterns including:
- Pagination with offset/limit and system query size limits
- NULL handling with
IS NOT NULLpredicates- Complex filter conditions with OR clauses and date range comparisons
- Aggregation with COUNT() grouped by multiple fields
- SAFE_CAST usage aligns with the PR's type coercion approach for handling type mismatches
integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative.yaml (2)
3-10: Logical plan structure is correct and consistent.The dedup plan correctly:
- Partitions
ROW_NUMBER()by$4, $5(new_gender, new_state) which are the LOWER expression results- Filters nulls on partition keys before windowing to ensure correct dedup behavior
- Maintains sort order for deterministic dedup results
The column references align properly with the projection schema defined in line 9.
11-12: Physical plan pushdown is consistent with logical plan.The OpenSearchRequestBuilder correctly reflects:
- Composite aggregation on
new_gender/new_statewith LOWER scripts matching the logical projectiontop_hitswithsize:2matching the dedup limit condition<=($6, 2)- Proper
_sourceincludes andscript_fieldsfor computed columnsThe test data is realistic and representative of Calcite's explain output for dedup with expressions.
integ-test/src/test/resources/expectedOutput/calcite/big5/dedup_metrics_size_field.yaml (2)
1-10: LGTM! Logical plan structure is consistent and correctly represents dedup semantics.The column indices are properly aligned:
$28correctly referencesmetrics.sizefor both the partition key and NULL filter$45correctly references_row_number_dedup_for the dedup filter (<= 1)- Outer projection indices match the inner projection field positions
The plan correctly handles the edge case of NULL partition keys with
IS NOT NULL($28)before windowing.
12-13: Physical plan pushdown correctly implements dedup semantics.The OpenSearch aggregation structure is appropriate:
- Composite aggregation with
top_hits(size=1)correctly deduplicates by taking first row permetrics.sizepartition_sourceincludes field list matches the logical projectionmissing_bucket:falsecorrectly excludes NULL partition keys (aligned with theIS NOT NULLfilter in logical plan)integ-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yaml (2)
5-9: Field indices correctly reflect added derived field.The field index shifts ([$14]-[$26] and join condition =($13, $20)) are consistent with the addition of the
$f13derived field in line 7. TheCAST(REX_EXTRACT(...)):VARCHARwrapper aligns with the PR's Calcite validation system that automatically inserts type casts during coercion.
20-21: Physical plan correctly propagates CAST to pushdown context.The
EnumerableCalcandCalciteEnumerableIndexScancorrectly incorporate theCAST(REX_EXTRACT(...)):VARCHARin both the projection and theSORT_EXPRpushdown. This ensures the sort-merge join operates on type-coerced values, consistent with the logical plan.integ-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yaml (2)
6-14: Logical plan structure looks correct.The plan properly handles:
- Empty string filtering on
Referer(line 13)- Domain extraction via
REGEXP_REPLACEwith pattern^https?://(?:www\.)?([^/]+)/.*$(line 12)- NULL filtering on computed
kfield (line 11)- Aggregation with
AVG,COUNT,MIN(line 9)- Post-aggregation filtering on count > 100000 (line 7)
The wide projection at line 12 covering all 111 clickbench fields is expected for the hits table schema.
18-19: Physical plan correctly pushes aggregation to OpenSearch.The pushdown context appropriately includes:
- Filter pushdown:
<>($0, '')and existence check on Referer- Composite aggregation for group-by on computed
k(domain extraction)- Sub-aggregations:
avgwithCHAR_LENGTHscript,top_hitsformin(Referer)The base64-encoded Calcite scripts enable server-side evaluation of
REGEXP_REPLACEandCHAR_LENGTHexpressions, which is the expected pattern for complex expression pushdown.integ-test/src/test/resources/expectedOutput/calcite/explain_output.yaml (2)
9-14: LGTM!The logical plan structure is internally consistent. Column references correctly trace through the plan:
- Aggregate outputs
state@0, city@1, avg_age@2- Project reorders to
avg_age@0, state@1, age2@2- Sort on
$1correctly sorts bystateThe test data covers relevant scenarios including aggregation, computed columns (
age2=[+($2, 2)]), NULL handling (ASC-nulls-first,IS NOT NULL), and deduplication with window functions.
21-21: LGTM!The physical plan correctly pushes down the query to OpenSearch:
- Filter
>($2, 30)maps to therangequery onage- Composite aggregation with
stateandcitybuckets matches the logical groupingSORT->[0 ASC FIRST]aligns with"order":"asc"and"missing_order":"first"on the state bucket- The
avg_agemetric aggregation is correctly nestedThe expected output accurately represents the Calcite enumerable index scan path with appropriate pushdown context.
integ-test/src/test/resources/expectedOutput/calcite/clickbench/q28.yaml (3)
6-14: Logical plan structure looks correct for the Calcite validation changes.The restructured plan correctly:
- Reorders projections to place the final column mapping (
l=[$0], c=[$1], CounterID=[$2]) after filtering.- Applies the filter
>($1, 100000)on the aggregated count before final projection.- Expands all 111 fields in the intermediate
LogicalProject(line 12) to preserve field indices during validation.- Maintains consistent field index references:
$103forCounterIDand$26forURL.The plan flow (scan → filter non-empty URL → project → filter non-null CounterID → project subset → aggregate → reproject → filter count → final project) is logically sound.
18-18: ExplicitBIGINTtype annotation reflects Calcite type coercion.The change from an implicit integer literal to
100000:BIGINTis expected behavior from the Calcite validation system, which now performs explicit type annotation during plan generation. This aligns with the PR's objective of migrating type checking/coercion to Calcite's validation system.
19-19: Script payload is valid and properly encoded.The Base64-encoded script decodes successfully to a valid JSON structure that correctly represents the
CHAR_LENGTHoperation with adynamicParam: 0reference. TheDIGESTSarray["URL"]properly maps to this dynamic parameter at index 0, confirming the field binding is correct.docs/user/ppl/cmd/bin.md (1)
503-508: No changes required. The example is already correctly marked as not-yet-implemented with an explicit TODO comment (<!-- TODO: Enable after fixing https://github.com/opensearch-project/sql/issues/4973 -->) and theignoreflag on the code block. This is the proper way to document planned features that are not yet functional.
I have updated a profiling result in the PR description. It's profiled with this script. Based on the result, the validation layer brings no significant performance degration. |
|
Hi @ykmr1224 , do you have any concern over this PR? I foresee that it will bring some impact to the schema-on-read solution, be it in a helpful or negative way. I'm worrying that this PR will make your solution difficult to proceed. |
Signed-off-by: Yuanchun Shen <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (3)
184-197: UnusedString.formatargument.The format string has only two placeholders (
%s) but three arguments are provided. The thirdTEST_INDEX_WORK_INFORMATIONargument at line 196 is never used.🔧 Suggested fix
assertYamlEqualsIgnoreId( expected, explainQueryYaml( String.format( "source = %s" + "| eval count_dept = [" + " source = %s | stats count(name)" + " ]" + "| fields name, count_dept", - TEST_INDEX_WORKER, TEST_INDEX_WORK_INFORMATION, TEST_INDEX_WORK_INFORMATION))); + TEST_INDEX_WORKER, TEST_INDEX_WORK_INFORMATION)));
199-212: UnusedString.formatargument.Same issue as above - the format string has only two
%splaceholders but three arguments are provided at line 211.🔧 Suggested fix
assertYamlEqualsIgnoreId( expected, explainQueryYaml( String.format( "source = %s" + "| where id > [" + " source = %s | stats count(name)" + " ] + 999" + "| fields name", - TEST_INDEX_WORKER, TEST_INDEX_WORK_INFORMATION, TEST_INDEX_WORK_INFORMATION))); + TEST_INDEX_WORKER, TEST_INDEX_WORK_INFORMATION)));
2016-2037: Missing@Testannotation - test will not execute.The
testComplexDedupmethod is missing the@Testannotation, so it will not be executed as part of the test suite.🔧 Suggested fix
+ @Test public void testComplexDedup() throws IOException { enabledOnlyWhenPushdownIsEnabled();
🧹 Nitpick comments (4)
integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_rename.yaml (1)
11-12: Physical plan correctly pushes down aggregation and scripted fields.The physical plan appropriately translates to OpenSearch:
- Composite aggregation with
top_hitssize=2 matching the dedup count- Script fields for computed columns with correct DIGESTS referencing source fields (
gender.keyword,state.keyword)- Source includes/excludes properly configured
The single-line format for the physical plan JSON makes diff reviews challenging. Consider using YAML block scalar with literal style and formatting the JSON across multiple lines for improved readability and easier change tracking in future PRs—though this is a low-priority suggestion if exact-match testing requires this format.
integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java (1)
469-500: LGTM - well-structured overload following existing patterns.The implementation correctly:
- Validates non-empty input with a clear exception message
- Short-circuits on first successful match
- Propagates the last failure for diagnostic purposes
- Has comprehensive JavaDoc
The pattern is consistent with the existing two-argument overload at lines 460-467.
🔧 Optional: Add null validation for defensive programming
For better error messages when debugging test failures, consider adding a null check:
public static void assertYamlEqualsIgnoreId(List<String> expectedYamls, String actualYaml) { + if (expectedYamls == null) { + throw new IllegalArgumentException("expectedYamls must not be null"); + } if (expectedYamls.isEmpty()) { throw new IllegalArgumentException("expectedYamls list must not be empty"); }This is optional since the existing methods in this file don't perform extensive null checks either, and test utility callers are expected to provide valid inputs.
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (2)
1811-1818: TODO comment indicates a known regression.The comment "TODO: Fix -- now it gets away from push-down" suggests that push-down optimization is not working as expected for this case. Consider creating an issue to track this.
Would you like me to open a GitHub issue to track this push-down optimization regression for the
chartcommand withuseother=true?
2066-2074: Non-deterministic plan output handling.The use of
List.of()with multiple expected plan files (explain_dedup_expr4.yaml, alternatives) handles legitimate optimizer non-determinism. This is an acceptable pattern, but consider documenting why multiple valid outputs exist (e.g., different but equivalent join orderings, filter placements).
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/ppl/ExplainIT.javainteg-test/src/test/java/org/opensearch/sql/util/MatcherUtils.javainteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4_alternative1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_rename.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative2.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- integ-test/src/test/java/org/opensearch/sql/ppl/ExplainIT.java
🧰 Additional context used
📓 Path-based instructions (6)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources
- Verify test data files exist in integ-test/src/test/resources/
- Check test assertions are meaningful and specific
- Validate tests clean up resources after execution
- Ensure tests are independent and can run in any order
- Flag tests that reference non-existent indices (e.g., EMP)
- Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
integ-test/src/test/resources/**/*
⚙️ CodeRabbit configuration file
integ-test/src/test/resources/**/*: - Verify test data is realistic and representative
- Check data format matches expected schema
- Ensure test data covers edge cases and boundary conditions
Files:
integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4_alternative1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_rename.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push.yaml
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4_alternative1.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_rename.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative2.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push.yaml
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (21, unit)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (25, doc)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: security-it-linux (25)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: CodeQL-Scan (java)
- GitHub Check: test-sql-cli-integration (21)
🔇 Additional comments (9)
integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4_alternative1.yaml (1)
1-12: LGTM! Well-structured expected output for dedup-with-expressions scenario.The logical plan correctly models:
- ROW_NUMBER windowing partitioned by computed expressions (
$4,$5=new_gender,new_statefrom LOWER)- Null filtering on partition keys before dedup
- Sort order preservation (ASC-nulls-first, DESC-nulls-last)
The physical plan appropriately shows:
- Composite aggregation with scripted fields for the computed partition keys
- top_hits with size=2 matching the
<= 2dedup filter- Base64-encoded calcite scripts for OpenSearch execution
This alternative1 variant correctly demonstrates dedup partitioning on expression-derived columns.
integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative2.yaml (1)
1-12: LGTM! Correctly represents an alternative dedup optimization path.This alternative2 variant differs from alternative1 in a meaningful way:
- Logical plan:
PARTITION BY $1, $3(originalgender,statefields) vs alternative1's$4, $5(computednew_gender,new_state)- Physical plan: Uses direct field terms (
"field":"gender.keyword") instead of scripted aggregation sourcesThe distinction correctly demonstrates that the optimizer may choose to partition on either the source fields or the computed expressions depending on the query context. Both the logical structure and OpenSearch request payload are consistent with this optimization choice.
integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_rename.yaml (1)
1-10: Logical plan structure is correct and well-formed.The logical plan correctly represents the dedup + rename transformation:
ROW_NUMBER() OVER (PARTITION BY $4, $5)for deduplication partitioning<=($6, 2)filter for allowedDuplicates=2LOWER($4)andLOWER($7)for the rename expressions- Proper null handling with
IS NOT NULLfilters- Sort directions preserved (
ASC-nulls-first,DESC-nulls-last)integ-test/src/test/resources/expectedOutput/calcite/explain_count_agg_push.yaml (2)
1-5: LGTM - Logical plan structure is correct.The logical plan correctly represents a count aggregation scenario:
LogicalSystemLimitenforces the query size limitLogicalAggregate(group=[{}], cnt=[COUNT()])represents a global COUNT aggregationCalciteLogicalIndexScanon the standard test indexThe hierarchy and notation follow Calcite conventions.
6-7: Physical plan correctly represents aggregation pushdown.The OpenSearchRequestBuilder configuration is appropriate for a count aggregation:
size=0correctly indicates no documents should be fetched (aggregation-only)track_total_hits=2147483647(Integer.MAX_VALUE) ensures accurate countingAGGREGATIONandLIMITpushdowns are properly captured in PushDownContextThe
rel#:LogicalAggregate.NONE.[]pattern appears to be ID-normalized for stable test comparisons (consistent withassertYamlEqualsIgnoreIdusage mentioned in the PR).integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4.yaml (1)
6-12: LGTM! Column references are now consistent across the plan.The updated expected output correctly reflects the fix where dedup partitioning and null checks now reference
$1, $3(gender, state) instead of$4, $5(new_gender, new_state). This aligns the PARTITION BY columns with the sort columns used in bothLogicalSystemLimit(line 3) andLogicalSort(line 8), and the physical plan's composite buckets ongender.keywordandstate.keywordare consistent with this change.integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (3)
32-49: LGTM!The test class properly extends
ExplainIT, enables Calcite mode, and loads the necessary test indices. The structure follows established integration test patterns.
444-482: LGTM!The timechart tests verify the correct calculation formulas for
per_second,per_minute,per_hour, andper_dayfunctions. The expressions correctly include the time unit conversions and TIMESTAMPDIFF calculations.
1306-1331: LGTM - proper resource cleanup pattern.Good use of try-finally blocks to ensure
resetQueryBucketSize()is called even if assertions fail. This pattern is consistently applied throughout the test class for test isolation.
Description
This PR migrates from our custom type checking mechanism to Apache Calcite's native validation system by introducing a SqlNode validation layer. This addresses the lack of a proper SQL validation phase and resolves compatibility issues with user-defined types (UDTs).
This implements Approach 1 described in #3865.
How It Works
The PR introduces a validation layer in the query execution pipeline:
This approach leverages Calcite's robust type system while preserving OpenSearch's custom type semantics through careful type mapping and restoration.
Benefits
sqrt(x)→pow(x, 0.5)) before generating physical plansWork Items / Implementation Details
Core Infrastructure
deriveType,coerceOperandType, and type inference methodsType System Enhancements
SqlTypeNameenum with OpenSearch UDTs through dynamic type mapping:EXPR_TIME→SqlTypeName.TIME) before validationcommonTypeForBinaryComparisonto enable datetime cross-type comparisons (DATE vs TIME, etc.)SAFE_CASTfor string-to-number conversions to tolerate malformatted data; useCASTfor literal numbersFunction & Operator Handling
array_slice,reduce,mvappend, etc.) with proper type inferencejson_extract,json_set, etc.)atanoverloading, percentile approximations)DISTINCT_COUNT_APPROX,COUNT(*)rewriting, etc.)ADDoperator: String concatenation vs numeric additionATAN: Single-operand vs two-operand versionsGEOIP: String overrides due to UDT erasureSqlKindforDIVIDEandMODUDFsQuery Construct Support
RelHintthrough conversions (addedSqlHintforLogicalAggregate)bucket_nullableflagsIN/NOT INwith tuple inputs via row constructor rewritingDialect & Compatibility
LogicalValuesfor empty row generationEdge Cases & Fixes
SAFE_CASTRelToSqlConverteris instantiated per-use (stateful component)Test Fixes
CalcitePPLDateTimeBuiltinFunctionIT: Interval semanticsCalcitePPLBuiltinFunctionIT: LOG function, sarg deserializationCalciteArrayFunctionIT: Type checkers, reduce function inferenceCalciteMathematicalFunctionIT,CalcitePPLAggregationITCalciteBinCommandIT: Timestamp operations, windowed aggregates in GROUP BYCalciteStreamstatsCommandIT: Sort columns, bucket_nullableCalcitePPLJsonBuiltinFunctionIT: String conversionCode Cleanup
PPLFuncImpTableUDFOperandMetadata.wrapUDTinterfaceCalciteFuncSignature,PPLTypeCheckerEnhancedCoalesceinto built-inCoalesceOptimizations
SAFE_CASTon non-string literal numbers (useCASTfor better performance)dedupoptimization issuesPerformance Impact
Profiled on personal laptop, each test runs twice, then averaged.
Conclusion: No significant performance degradation. ClickBench shows slight overhead (+12.3%) during analyze phase due to validation, but big5 and TPCH show improvements, likely from better query optimization enabled by proper type information.
Related Issues
Resolves #4636, resolves #3865
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.